A Structured Language Model for Incremental Tree-to-String Translation
نویسندگان
چکیده
Tree-to-string systems have gained significant popularity thanks to their simplicity and efficiency by exploring the source syntax information, but they lack in the target syntax to guarantee the grammaticality of the output. Instead of using complex tree-to-tree models, we integrate a structured language model, a left-to-right shift-reduce parser in specific, into an incremental tree-to-string model, and introduce an efficient grouping and pruning mechanism for this integration. Large-scale experiments on various Chinese-English test sets show that with a reasonable speed our method gains an average improvement of 0.7 points in terms of (Ter-Bleu)/2 than a state-of-the-art tree-to-string system.
منابع مشابه
Efficient Incremental Decoding for Tree-to-String Translation
Syntax-based translation models should in principle be efficient with polynomially-sized search space, but in practice they are often embarassingly slow, partly due to the cost of language model integration. In this paper we borrow from phrase-based decoding the idea to generate a translation incrementally left-to-right, and show that for tree-to-string models, with a clever encoding of derivat...
متن کاملDeep Syntactic Structures for String-to-Tree Translation
1.1 String-to-tree translation A state-of-the-art syntax-based Statistical Machine Translation (SMT) model, string-to-tree translation model (Galley et al., 2004; Galley et al., 2006; Chiang et al., 2009), is to construct a number of parse trees of the target language by ‘parsing’ a source language sentence making use of a bilingual translation grammar. Given a set of parallel sentences for tra...
متن کاملForest-based Tree Sequence to String Translation Model
This paper proposes a forest-based tree sequence to string translation model for syntaxbased statistical machine translation, which automatically learns tree sequence to string translation rules from word-aligned sourceside-parsed bilingual texts. The proposed model leverages on the strengths of both tree sequence-based and forest-based translation models. Therefore, it can not only utilize for...
متن کاملLeft-to-Right Tree-to-String Decoding with Prediction
Decoding algorithms for syntax based machine translation suffer from high computational complexity, a consequence of intersecting a language model with a context free grammar. Left-to-right decoding, which generates the target string in order, can improve decoding efficiency by simplifying the language model evaluation. This paper presents a novel left to right decoding algorithm for tree-to-st...
متن کاملAkamon: An Open Source Toolkit for Tree/Forest-Based Statistical Machine Translation
We describe Akamon, an open source toolkit for tree and forest-based statistical machine translation (Liu et al., 2006; Mi et al., 2008; Mi and Huang, 2008). Akamon implements all of the algorithms required for tree/forestto-string decoding using tree-to-string translation rules: multiple-thread forest-based decoding, n-gram language model integration, beamand cube-pruning, k-best hypotheses ex...
متن کامل